MEDB 5502, Module 14, review

Topics to be covered, 1 of 2

  • What you will learn
    • 01 Linear regression, analysis of variance
    • 02 Linear regression with multiple independent variables
    • 03 Analysis of covariance
    • 04 Multi-factor analysis of variance
    • 05 Dimension reduction
    • 06 Logistic regression
    • 07 Diagnostic tests

Topics to be covered, 2 of 2

  • What you will learn
    • 08 Survival analysis
    • 09 Meta-analysis
    • 10 Dark side of data science
    • 11 Hierarchical models
    • 12 Longitudinal data
    • 13 Bayesian statistics

Module 01, Review

  • Simple linear regression
  • One factor analysis of variance

Module 01, SPSS scatterplot

Module 01, SPSS boxplot

Module 01, SPSS calculation of R Square

Module 01, SPSS ANOVA table

Module 01, SPSS linear regression coefficients

Break #1

  • What you have learned
    • 01 Linear regression, analysis of variance
  • What’s coming next
    • 02 Linear regression with multiple independent variables

Module 02, Linear regression with multiple independent variables

  • Analysis of variance table
    • R-squared
    • Partial F tests
  • Stepwise regression
  • Interpretation
  • Collinearity
  • Mediation

Module 02, Checking assumptions

  • Non-normality
    • Q-Q plot of residuals
  • Lack of independence
    • Assessed qualitatively
  • Unequal variances, Non-linearity
    • Residual scatterplot

Module 02, SPSS dialog box for the general linear model

Module 02, SPSS computation of R-squared

  • 10,548.480/15,079.017 = 0.70
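
The ratio above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the numerator is the model (regression) sum of squares and the denominator is the corrected total sum of squares from the SPSS table; the variable names are illustrative.

```python
# Sums of squares as shown on the slide.
# Assumption: the first is the model SS, the second the corrected total SS.
ss_model = 10548.480
ss_total = 15079.017

# R-squared is the share of total variation explained by the model.
r_squared = ss_model / ss_total
print(f"R-squared = {r_squared:.2f}")
```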

Module 02, SPSS computation of change in R-squared

  • \(Change\ in\ R^2=0.700-0.693=0.007\)

Module 02, SPSS computation of partial F-test

Module 02, SPSS computation of full regression model

Module 02, SPSS computation of collinearity statistics

Module 02, What is mediation?

  • “A situation when the relationship between a predictor variable and an outcome variable can be explained by their relationship to a third variable (the mediator)”
    • Andy Field, Section 11.4

Module 02, SPSS assessment of mediation

Module 02, SPSS Q-Q plot

Module 02, SPSS Scatterplot, 1 of 4

Module 02, SPSS Scatterplot, 2 of 4

Module 02, SPSS Scatterplot, 3 of 4

Module 02, SPSS Scatterplot, 4 of 4

Break #2

  • What you have learned
    • 02 Linear regression with multiple independent variables
  • What’s coming next
    • 03 Analysis of covariance

Module 03, Analysis of covariance

  • Confounding/covariate imbalance
  • Interpretation
  • Interactions

Module 03, Checking assumptions

  • Non-normality
    • Q-Q plot of residuals
  • Lack of independence
    • Assessed qualitatively
  • Unequal variances, Non-linearity
    • Residual scatterplots

Module 03, SPSS calculation of unadjusted estimates

Module 03, SPSS calculation of adjusted estimates

Module 03, SPSS visualization, 1 of 2

Module 03, SPSS visualization, 2 of 2

Module 03, SPSS Q-Q plot

Module 03, SPSS scatterplot

Module 03, SPSS interaction test

Break #3

  • What you have learned
    • 03 Analysis of covariance
  • What’s coming next
    • 04 Multi-factor analysis of variance

Module 04, Multi-factor analysis of variance

  • Tukey post hoc test
  • Interaction

Module 04, Checking assumptions

  • Non-normality
    • Q-Q plot of residuals
  • Lack of independence
    • Assessed qualitatively
  • Unequal variances
    • Boxplots

Module 04, SPSS crosstabulation

Module 04, SPSS analysis of variance table

Module 04, SPSS removing irrelevant rows

Module 04, SPSS parameter estimates

Module 04, SPSS Tukey test

Module 04, SPSS Q-Q plot

Module 04, SPSS scatterplot

Module 04, SPSS, Box plots of exercise data

Module 04, SPSS, Mean values for the interaction

Module 04, SPSS, Analysis of variance table for interaction model

Module 04, SPSS, Parameter estimates for the interaction model

Module 04, SPSS, Interaction plot, 1 of 2

Module 04, SPSS, Interaction plot, 2 of 2

Module 04, When you can’t estimate an interaction

  • Special case, n=1
    • Only one observation for each combination of categories

Module 04, SPSS, Example, full moon study, 1 of 2

Module 04, SPSS, Example, full moon study, 2 of 2

Module 04, SPSS, Interaction between exercise program and hours spent exercising

Module 04, SPSS, Testing for interaction in analysis of covariance

Module 04, SPSS, Table with irrelevant rows removed

Module 04, SPSS, Parameter estimates

  • Intercept for prog=1, -8.997 + 2.216 = -6.781
  • Intercept for prog=2, 9.993 + 2.216 = 12.209
  • Intercept for prog=3, 2.216
  • Slope for prog=1, 10.409 + -2.956 = 7.453
  • Slope for prog=2, 9.83 + -2.956 = 6.874
  • Slope for prog=3, -2.956
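
The bullets above add each program's offset to the reference-level estimate. A small sketch of that bookkeeping, assuming prog=3 is the reference category (so its offsets are zero); the dictionary and variable names are illustrative, not SPSS output labels.

```python
# Estimates for the reference level (prog=3), copied from the slide.
base_intercept = 2.216
base_slope = -2.956

# Offsets relative to prog=3; the reference level has no offset.
intercept_offsets = {1: -8.997, 2: 9.993, 3: 0.0}
slope_offsets = {1: 10.409, 2: 9.83, 3: 0.0}

# Each program's fitted line is (intercept, slope).
lines = {
    prog: (base_intercept + intercept_offsets[prog],
           base_slope + slope_offsets[prog])
    for prog in (1, 2, 3)
}

for prog, (intercept, slope) in lines.items():
    print(f"prog={prog}: intercept={intercept:.3f}, slope={slope:.3f}")
```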

Module 04, SPSS, Analysis of variance table

Module 04, SPSS, Table of means

Module 04, SPSS, Centered analysis

Module 04, Weight loss at various conditions

  • hours = 2 (mean), effort = 30 (mean),
    • \(\hat Y\) = 10.005
  • hours = 4 (mean+2), effort = 30 (mean),
    • \(\hat Y\) = 10.005 + 2.291*2 = 14.587
  • hours = 2 (mean), effort = 50 (mean+20)
    • \(\hat Y\) = 10.005 + 0.707*20 = 24.145
  • hours = 4 (mean+2), effort = 50 (mean+20)
    • \(\hat Y\) = 10.005 + 2.291*2 + 0.707*20 + 0.393*2*20 = 44.447
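
The four predictions above come from one equation with centered predictors. A minimal sketch, written in terms of deviations from the means so that it matches the arithmetic on the slide; the function and constant names are illustrative.

```python
# Coefficients from the centered analysis, as used on the slide.
INTERCEPT = 10.005      # predicted weight loss at mean hours and mean effort
B_HOURS = 2.291         # effect of one extra hour, at mean effort
B_EFFORT = 0.707        # effect of one extra unit of effort, at mean hours
B_INTERACTION = 0.393   # hours-by-effort interaction

def predicted_loss(d_hours, d_effort):
    """Predicted weight loss at given deviations from mean hours and mean effort."""
    return (INTERCEPT
            + B_HOURS * d_hours
            + B_EFFORT * d_effort
            + B_INTERACTION * d_hours * d_effort)

for d_hours, d_effort in [(0, 0), (2, 0), (0, 20), (2, 20)]:
    print(d_hours, d_effort, round(predicted_loss(d_hours, d_effort), 3))
```

Because the predictors are centered, the intercept is the prediction at the means, and the interaction term only contributes when both deviations are nonzero.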

Module 04, SPSS, Line plots of means for unbalanced data

Module 04, SPSS, Table of means

Module 04, SPSS, Table of frequencies and column percentages

Break #4

  • What you have learned
    • 04 Multi-factor analysis of variance
  • What’s coming next
    • 05 Dimension reduction

Module 05, Dimension reduction

  • Principal components analysis
    • Eigenvectors, Eigenvalues
  • Factor analysis
    • Factor rotation

Module 05, SPSS, Correlation matrix, 1 of 3

Module 05, SPSS, Correlation matrix, 2 of 3

Module 05, SPSS, Correlation matrix, 3 of 3

Module 05, SPSS, Communalities

Module 05, SPSS, Eigenvalues

Module 05, SPSS, Scree plot

Module 05, SPSS, Component matrix

Module 05, SPSS, Boxplots of first four principal components

Module 05, SPSS, Scatterplot of first four principal components

Module 05, SPSS, R-squared using four principal components

Module 05, SPSS, R-squared using all 24 variables

Module 05, SPSS, Rotated factor pattern, 1 of 3

Module 05, SPSS, Rotated factor pattern, 2 of 3

Module 05, SPSS, Rotated factor pattern, 3 of 3

Break #5

  • What you have learned
    • 05 Dimension reduction
  • What’s coming next
    • 06 Logistic regression

Module 06, Precursors to logistic regression

  • Test of two proportions
  • Chi-square test of independence
  • Odds ratio versus relative risk

Module 06, Logistic regression

  • Linear on log odds scale
  • Assumptions
    • Independence
    • Linearity

Module 06, SPSS, Confidence interval and test of hypothesis

Module 06, SPSS, Example: Titanic survival by sex

  • Moderate or large sample size: Pearson Chi-Square
  • Small sample size: Fisher’s Exact test

Module 06, SPSS, An example of a log odds model with real data, 1 of 3

Module 06, SPSS, An example of a log odds model with real data, 2 of 3

  • log odds = -16.72 + 0.577 \(\times\) GA

Module 06, SPSS, An example of a log odds model with real data, 3 of 3

  • log odds = -16.72 + 0.577 \(\times\) 30 = 0.59
  • odds = exp(log odds) = 1.8
  • prob = odds / (1+odds) = 0.64
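
The three steps above (log odds, odds, probability) can be sketched as a short Python function. The intercept and slope are the values shown on the slide; the function name and the argument name `ga` are illustrative.

```python
import math

def predict_from_log_odds(ga, intercept=-16.72, slope=0.577):
    """Return (log odds, odds, probability) for a given value of GA."""
    log_odds = intercept + slope * ga
    odds = math.exp(log_odds)          # undo the log
    prob = odds / (1 + odds)           # convert odds to a probability
    return log_odds, odds, prob

log_odds, odds, prob = predict_from_log_odds(30)
print(f"log odds = {log_odds:.2f}, odds = {odds:.1f}, prob = {prob:.2f}")
```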

Module 06, SPSS, Categorical variables in a logistic regression model, 1 of 3

  • 1st class odds: 129/193 = 0.67 or 193/129 = 1.5
  • 2nd class odds: 161/119 = 1.35 or 119/161 = 0.74
  • 3rd class odds: 573/138 = 4.15 or 138/573 = 0.24

Module 06, SPSS, Categorical variables in a logistic regression model, 2 of 3

  • 1.50 / 0.24 = 6.212 (from unrounded odds)
  • 0.74 / 0.24 = 3.069 (from unrounded odds)

Module 06, SPSS, Categorical variables in a logistic regression model, 3 of 3

  • 0.74 / 1.50 = 0.494 (from unrounded odds)
  • 0.24 / 1.50 = 0.161 (from unrounded odds)
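
The odds ratios on these three slides can be recomputed directly from the counts, which also shows that the three-decimal results come from the unrounded odds rather than the rounded 1.50, 0.74, and 0.24. This is a sketch; the slides do not label which count is which outcome, so the code simply takes the second count over the first, and the names are illustrative.

```python
# Pairs of counts per passenger class, in the order shown on the slide.
counts = {"1st": (129, 193), "2nd": (161, 119), "3rd": (573, 138)}

# Odds as the second count divided by the first (1.5, 0.74, 0.24 on the slide).
odds = {cls: second / first for cls, (first, second) in counts.items()}

def odds_ratio(cls, reference):
    """Odds for one class divided by the odds for the reference class."""
    return odds[cls] / odds[reference]

print(round(odds_ratio("1st", "3rd"), 3), round(odds_ratio("2nd", "3rd"), 3))
print(round(odds_ratio("2nd", "1st"), 3), round(odds_ratio("3rd", "1st"), 3))
```

Switching the reference class from 3rd to 1st inverts the comparison: the 3rd-versus-1st odds ratio is the reciprocal of the 1st-versus-3rd odds ratio.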

Module 06, SPSS, Odds ratios for first class

Module 06, SPSS, Odds ratio for second class

Module 06, SPSS, Odds ratio for third class

Module 06, SPSS, Logistic regression with interaction

  • Odds ratio for 3rd class = 4.608
  • Odds ratio for 1st class = 4.608 \(\times\) 6.572 = 30.2
  • Odds ratio for 2nd class = 4.608 \(\times\) 9.289 = 42.8

Module 06, SPSS, Line plot for interaction, 1 of 2

Module 06, SPSS, Line plot for interaction, 2 of 2

Module 06, SPSS, Creating a binary outcome

Module 06, SPSS, Crosstabulation of predictor and outcome

Module 06, SPSS, Unadjusted odds ratio

Module 06, SPSS, Adjusted odds ratio

Break #6

  • What you have learned
    • 06 Logistic regression
  • What’s coming next
    • 07 Diagnostic tests

Module 07, Diagnostic tests

  • Sensitivity, specificity
    • SpPin, SnNout
  • Positive/negative predictive value
  • Likelihood ratio
  • ROC curve
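
The measures listed above all follow from a two-by-two table of test result against true disease status. A minimal sketch using standard definitions; the counts are hypothetical and chosen only for illustration, and the function name is not from the course materials.

```python
def diagnostic_summary(tp, fn, fp, tn):
    """Standard diagnostic test measures from a 2x2 table of counts."""
    sensitivity = tp / (tp + fn)   # P(test positive | disease)
    specificity = tn / (tn + fp)   # P(test negative | no disease)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # P(disease | test positive)
        "npv": tn / (tn + fn),     # P(no disease | test negative)
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }

# Hypothetical counts, not data from the course.
summary = diagnostic_summary(tp=90, fn=10, fp=30, tn=170)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```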

Break #7

  • What you have learned
    • 07 Diagnostic tests
  • What’s coming next
    • 08 Survival analysis

Module 08, Basic survival analysis

  • Censoring
  • Kaplan-Meier curve
  • Log rank test
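
To make the role of censoring concrete, here is a small hand-rolled sketch of the Kaplan-Meier product-limit estimate. The data are hypothetical (event = 1 for an observed event, 0 for a censored observation), and the function is an illustration rather than the SPSS procedure.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed event."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # censored subjects leave the risk set without a drop
    return curve

# Hypothetical follow-up times and event indicators.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored observations never lower the curve; they only shrink the number at risk for later event times.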

Module 08, Cox regression

  • Define hazard function
    • Increasing/decreasing/constant hazard
    • Hazard ratio
  • Assumptions
    • Independence
    • Non-informative censoring

Module 08, SPSS, Event count

Module 08, SPSS, Overall Kaplan-Meier curve

Module 08, SPSS, Event count by gender

Module 08, SPSS, Histogram of ages

Module 08, SPSS, Quality check of age group coding

Module 08, SPSS, Event count by age group, 1 of 2

Module 08, SPSS, Event count by age group, 2 of 2

Module 08, SPSS, Kaplan-Meier analysis by gender, 1 of 3

Module 08, SPSS, Kaplan-Meier analysis by gender, 2 of 3

Module 08, SPSS, Kaplan-Meier analysis by gender, 3 of 3

Module 08, SPSS, Kaplan-Meier analysis by age group, 1 of 3

Module 08, SPSS, Kaplan-Meier analysis by age group, 2 of 3

Module 08, SPSS, Kaplan-Meier analysis by age group, 3 of 3

Module 08, SPSS, Mean ages for men and women

Module 08, SPSS, Unadjusted and adjusted Cox regression models for gender

Break #8

  • What you have learned
    • 08 Survival analysis
  • What’s coming next
    • 09 Meta-analysis

Module 09, Meta-analysis

  • Forest plot
  • Publication bias
    • Funnel plot
  • Heterogeneity
    • Cochran’s Q, I-squared
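
A brief sketch of how Cochran's Q and I-squared are computed under a fixed-effect (inverse-variance) weighting. The effect sizes and variances are hypothetical, not the vaccine data, and the function name is illustrative.

```python
def heterogeneity(effects, variances):
    """Return (pooled effect, Cochran's Q, I-squared) using inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of variation beyond what chance alone would produce.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i_squared

# Hypothetical study-level effects and their variances.
pooled, q, i_squared = heterogeneity([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
print(f"pooled = {pooled:.2f}, Q = {q:.2f}, I-squared = {100 * i_squared:.1f}%")
```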

Module 09, SPSS, Vaccine results, 1 of 6

Module 09, SPSS, Vaccine results, 2 of 6

Module 09, SPSS, Vaccine results, 3 of 6

Module 09, SPSS, Vaccine results, 4 of 6

Module 09, SPSS, Vaccine results, 5 of 6

Module 09, SPSS, Vaccine results, 6 of 6

Break #9

  • What you have learned
    • 09 Meta-analysis
  • What’s coming next
    • 10 Dark side of data science

Module 10, Dark side of data science

  • Empiricism
  • Reification
  • Bias in data science

Break #10

  • What you have learned
    • 10 Dark side of data science
  • What’s coming next
    • 11 Hierarchical models

Module 11, Hierarchical models

  • Clustered data
  • Between and within cluster variation
    • Intraclass correlation

Module 11, Checking assumptions

  • Independence
    • Only between clusters
  • Normality
    • Within clusters
    • Between clusters

Module 11, SPSS, Variance components, 1 of 2

Module 11, SPSS, Variance components, 2 of 2

Break #11

  • What you have learned
    • 11 Hierarchical models
  • What’s coming next
    • 12 Longitudinal data

Module 12, Longitudinal data

  • Random intercepts model
  • Random slopes model

Module 12, Checking assumptions

  • Independence
    • Only between subjects
  • Normality
    • Residuals
    • Random intercepts/slopes
  • Linearity
    • Scatterplot of residuals

Module 12, SPSS, Wide format

Module 12, SPSS, Boxplots

Module 12, SPSS, Colors and patterns

Module 12, SPSS, Tall format

Module 12, SPSS, Alternate clustering of boxplots

Module 12, SPSS, Random intercepts analysis, 1 of 6

Module 12, SPSS, Random intercepts analysis, 2 of 6

Module 12, SPSS, Random intercepts analysis, 3 of 6

Module 12, SPSS, Random intercepts analysis, 4 of 6

Module 12, SPSS, Random intercepts analysis, 5 of 6

Module 12, SPSS, Random intercepts analysis, 6 of 6

Break #12

  • What you have learned
    • 12 Longitudinal data
  • What’s coming next
    • 13 Bayesian statistics

Module 13, Bayesian statistics

  • Prior
    • Flat or non-informative prior
  • Likelihood
  • Posterior
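
One simple way to see prior, likelihood, and posterior working together is the beta-binomial model for a proportion. This is a sketch under stated assumptions: a flat Beta(1, 1) prior, a binomial likelihood, and hypothetical data of 7 successes and 3 failures; the function name is illustrative.

```python
def beta_binomial_posterior(successes, failures, prior_a=1.0, prior_b=1.0):
    """Update a Beta(prior_a, prior_b) prior with binomial data.

    Returns the posterior Beta parameters and the posterior mean.
    """
    post_a = prior_a + successes
    post_b = prior_b + failures
    return post_a, post_b, post_a / (post_a + post_b)

# Flat prior (Beta(1, 1)) combined with hypothetical data.
post_a, post_b, post_mean = beta_binomial_posterior(successes=7, failures=3)
print(f"posterior: Beta({post_a:g}, {post_b:g}), mean = {post_mean:.3f}")
```

With a flat prior the posterior is driven almost entirely by the data; a more informative prior (larger `prior_a` and `prior_b`) would pull the posterior mean toward the prior mean.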

Summary, 1 of 2

  • What you have learned
    • 01 Linear regression, analysis of variance
    • 02 Linear regression with multiple independent variables
    • 03 Analysis of covariance
    • 04 Multi-factor analysis of variance
    • 05 Dimension reduction
    • 06 Logistic regression
    • 07 Diagnostic tests

Summary, 2 of 2

  • What you have learned
    • 08 Survival analysis
    • 09 Meta-analysis
    • 10 Dark side of data science
    • 11 Hierarchical models
    • 12 Longitudinal data
    • 13 Bayesian statistics